Biologists have devoted much attention to assortative mating or homogamy, the tendency for sexual species 
to mate with similar others. In contrast, there has been little theoretical work on the broader phenomenon of 
homophily, the tendency for individuals to interact with similar others. Yet this behaviour is also widely 
observed in nature. Here, we model how natural selection can give rise to homophily when individuals 
engage in social interaction in a population with multiple observable phenotypes. Payoffs to interactions 
depend on whether individuals have the same or different phenotypes, and each individual has a 
preference that determines how likely they are to interact with others of their own phenotype (homophily) or 
of opposite phenotypes (heterophily). The results show that homophily tends to evolve under a wide variety 
of conditions, helping to explain its ubiquity in nature. 

Homophily, the tendency to interact with others of similar type, is widely observed in nature. Sex- and
age-related homophily, for example, shapes the formation of clusters of preferred companionships in zebras 1
and dolphins 2, and predicts both the quantity and quality of many primate interactions 3,4. Meerkats tend to
associate assortatively with other group members of similar attributes in dominance and foraging networks 5. And
across many dimensions of phenotypes, humans exhibit high levels of homophily in social tie formation 6-8. In
fact, recent evidence suggests that humans may even exhibit genotypic homophily, meaning that individuals with
a certain genotype are more likely to be friends with others of the same genotype 9.

Heterophily, the tendency to interact with others of different type, also exists in nature at both the cellular 10-12
and organismic levels. For example, research on collaboration networks suggests that people are likely to form
heterophilic task-related ties with those who are complementary to their own skill sets 8. Analogously,
hunter-gatherer life is characterised by long-term imbalances in productivity and consumption, and by the division of
labour 13; hence, one might expect that social interactions would, at least in part, be heterophilic, offering
complementary advantages to interacting parties; but they are not 7.

Indeed, heterophily is far less common than homophily. Much effort has focused on examining the functional
role homophily plays in a wide range of domains, including social segregation 14, cultural polarisation 15, friendship
formation 16, social contagion 17, and the evolution of cooperation 18,19. And the ubiquity of homophily suggests that
natural selection may favour it. Yet, to our knowledge, there have to date been no attempts to understand the
possible evolutionary origin of this phenomenon.

Here, we conceptualise the benefits to homophily and heterophily as the results of a simple coordination game.
For example, homophily may yield fitness advantages because individuals using the same mode of communication
may be able to act together more effectively. These advantages are sometimes called synergy. On the other
hand, heterophily may be beneficial because it gives rise to specialisation or gains from trade, such as when a
farmer interacts with a baker 20, when different scientists collaborate 21, or when individuals at different stages in
the life cycle interact 22.

We elaborate a simple model that assigns benefits to interactions and allows individuals to have preferences to
interact with others with similar or different phenotypes. We then let these preferences co-evolve with the set of
available phenotypes. This model shows that homophily emerges as the dominant preference under a wide variety
of conditions.

For simplicity, suppose there is a haploid asexual population of size N. There are M possible phenotypes (size,
colour, behaviour, etc.), and each individual i has an observable phenotype denoted by $G_i \in \{1, 2, \ldots, M\}$. To be
sure the phenotypes do not drive the results, we assume that none of these phenotypes alone makes individuals
more or less fit; only combinations of phenotypes between individuals determine fitness. When individuals i and j
interact, they obtain a homophilic interaction payoff, a, when they are of the same phenotype ($G_i = G_j$). We can
think of this as the payoff to synergy. When i and j are of different phenotypes ($G_i \neq G_j$), they receive a
heterophilic interaction payoff, b; we can think of this as the payoff to specialisation. We assume that when
individuals do not interact, their payoff is 0.

Table 1 | Empirical estimates of homophily in social networks of humans and other animals. Values indicate the
likelihood an individual will seek a social connection with individuals of the same phenotype (> 0.5 indicates a
homophilic preference). For example, when choosing companions, dolphins prefer associating with others of the
same sex rather than with those of the opposite sex. See SI for methods and data sources.

Species                               Phenotype   Mean homophilic preference   SE
Humans (adults, U.S.)                 sex         0.68                         3 × 10^-11
                                      age         0.67                         4 × 10^-4
Humans (children, U.S.)               sex         0.54                         3 × 10^-12
                                      race        0.65                         1 × 10^-5
                                      age         0.68                         8 × 10^-5
Humans (adults and children, India)   sex         0.59                         8 × 10^-12
                                      race        0.66                         1 × 10^-4
                                      age         0.53                         1 × 10^-4
Dolphins                              sex         0.60                         1 × 10^-10
Colobus monkeys                       sex         0.66                         2 × 10^-10
Zebras                                sex         0.66                         1 × 10^-10

We can describe three kinds of environments from this basic set 
up. When a > b there is an advantage to homophilic interactions. 
When a < b there is an advantage to heterophilic interactions. 
Finally, when a = b there is no advantage to either homophily or 
heterophily. 

At each time period, we assume there is a process that allows each individual to interact with another
individual. This process proceeds in three stages: 1) choose, 2) meet, 3) interact. With probability
$p_i \in [0, 1]$, individual i will choose to interact with individuals of the same phenotype (homophily), and
with probability $1 - p_i$ they will choose to interact with individuals of different phenotypes (heterophily).
Note that $p_i = 1$ means that individual i always chooses individuals of the same phenotype and never of a
different phenotype (perfect homophily). Conversely, $p_i = 0$ means that individual i always chooses
individuals of different phenotypes and never individuals of the same phenotype (perfect heterophily). And for
intermediate values $0 < p_i < 1$, individuals show a tendency to favour one kind of interaction over another.
When $p_i > 0.5$, individual i tends to be homophilic, and when $p_i < 0.5$, individual i tends to be heterophilic.

Next, individuals meet. For simplicity, we assume they are randomly paired with other members of the
population (one can imagine more complex assumptions about meeting; we consider this possibility below). Once
they meet, they interact, but only if both individuals have chosen to initiate an interaction with a partner that
is compatible with their preferences and phenotypes. Hence, the probability of a successful interaction between
two individuals i and j is $p_i p_j$ if they are of the same phenotype ($G_i = G_j$) and $(1 - p_i)(1 - p_j)$ if
they are of different phenotypes ($G_i \neq G_j$).
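The choose-meet-interact stage can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `success_probability` and `expected_payoff` are hypothetical, and the expected payoff is taken to be the success probability multiplied by the payoff a or b defined above.

```python
def success_probability(p_i, p_j, same_phenotype):
    """Probability that both partners agree to interact.

    p_i, p_j: homophilic preferences in [0, 1].
    same_phenotype: True if the two individuals share a phenotype.
    """
    if same_phenotype:
        # Both must have chosen a same-phenotype partner.
        return p_i * p_j
    # Both must have chosen a different-phenotype partner.
    return (1.0 - p_i) * (1.0 - p_j)


def expected_payoff(p_i, p_j, same_phenotype, a, b):
    """Expected payoff of one meeting (hypothetical helper).

    a: payoff to a homophilic interaction (synergy).
    b: payoff to a heterophilic interaction (specialisation).
    A failed interaction pays 0.
    """
    benefit = a if same_phenotype else b
    return benefit * success_probability(p_i, p_j, same_phenotype)
```

For instance, two individuals with p = 0.8 who share a phenotype interact with probability 0.64, whereas the same two preferences give a probability of about 0.04 when the phenotypes differ.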

Results 

One advantage of conceptualising homophily in this way is that we 
can estimate the average homophilic preference p for several species 
and phenotypes in a number of available data sets (Table 1; see also 
SI) assuming that observed interactions are successful matches. Note 
that these real world observations show that homophily (p > 0.5) is 
apparent in all cases, with the estimated value ranging between 0.53 
and 0.68. 

Returning to the model, if we let $\delta_{ij} = 1$ when $G_i = G_j$ and 0 otherwise, then the expected payoff for
each interaction is

$$\delta_{ij}\, a\, p_i p_j + (1 - \delta_{ij})\, b\, (1 - p_i)(1 - p_j).$$

And letting $x_i^l$ be the proportion of individuals in the population with preference i and phenotype l, the
likelihood of each encounter $q_{ij}$ is

$$q_{ij} = \delta_{ij} \sum_{h=1}^{M} x_i^h x_j^h + (1 - \delta_{ij}) \sum_{h=1}^{M} \sum_{l=1,\, l \neq h}^{M} x_i^h x_j^l.$$

If we let each individual in the population initiate an interaction, the average expected payoff to individual i
is then the average of this payoff over its encounters.

We assume fitness is an exponential function of payoffs 23, and individuals reproduce in proportion to their
fitness according to a frequency-dependent Moran process 24, which can occur either via natural selection (the
less fit die and are replaced by the more fit) or learning (the less fit copy the preferences of the more
fit) 25-29. We also allow for mutation. The probability that an offspring changes at random to one of the M
phenotypes (each with equal likelihood) is v, and in expectation $\nu = vN$ offspring will change. Similarly, the
probability that an offspring changes to a preference drawn from a uniform distribution with support [0, 1] is u,
with $\mu = uN$ offspring changing in expectation. For full details of the model, please refer to the SI.
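One update of such a process might look like the sketch below. Several details are assumptions made for illustration and are not taken from the text: fitness is written as exp(beta × payoff) with a selection intensity `beta`, each individual's payoff is averaged over all possible partners, and the individual that is replaced is chosen uniformly at random. All function names are hypothetical; the SI describes the actual model.

```python
import math
import random


def expected_payoff(i, j, phenotypes, prefs, a, b):
    """Expected payoff to i from meeting j (0 if the interaction fails)."""
    if phenotypes[i] == phenotypes[j]:
        return a * prefs[i] * prefs[j]
    return b * (1.0 - prefs[i]) * (1.0 - prefs[j])


def average_payoffs(phenotypes, prefs, a, b):
    """Average expected payoff of each individual over all other partners."""
    n = len(prefs)
    return [
        sum(expected_payoff(i, j, phenotypes, prefs, a, b)
            for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def moran_step(phenotypes, prefs, a, b, beta, u, v, M, rng):
    """One birth-death update, modifying the two lists in place."""
    n = len(prefs)
    payoffs = average_payoffs(phenotypes, prefs, a, b)
    # Assumption: fitness = exp(beta * payoff), beta = selection intensity.
    fitness = [math.exp(beta * x) for x in payoffs]
    # Birth: parent chosen in proportion to fitness.
    parent = rng.choices(range(n), weights=fitness, k=1)[0]
    child_phenotype = phenotypes[parent]
    child_pref = prefs[parent]
    # Phenotype mutation: jump to one of the M phenotypes, equally likely.
    if rng.random() < v:
        child_phenotype = rng.randint(1, M)
    # Preference mutation: redraw uniformly from [0, 1].
    if rng.random() < u:
        child_pref = rng.random()
    # Death: assumption, a uniformly random individual is replaced.
    dead = rng.randrange(n)
    phenotypes[dead] = child_phenotype
    prefs[dead] = child_pref
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible; population size stays fixed because every birth is paired with one death.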

Before discussing the equilibria of the model, we use a simple example to provide an intuition for why there is
an advantage to homophily. Suppose there are $N_1$ individuals in group 1 with phenotype 1 and $N_2$ in group 2
with phenotype 2. Further suppose that all individuals within each group have the same preference ($p_1$ and
$p_2$, respectively). The payoff to an individual in group 1 is then

$$\frac{a (N_1 - 1) p_1^2 + b N_2 (1 - p_1)(1 - p_2)}{N - 1},$$

where $N = N_1 + N_2$. Taking two extreme cases, perfect homophilic preferences ($p_1 = 1$) and perfect
heterophilic preferences ($p_1 = 0$), we can see that homophilic preferences yield a payoff advantage to
individuals in group 1 if $a > N_2 (1 - p_2) b / (N_1 - 1)$. Remarkably, this inequality shows that homophily
pays even if the benefit, a, to same-type interactions is lower than the benefit, b, to opposite-type
interactions. This happens when group 1 is in the majority (it is easier to connect to a similar-phenotype
individual in a larger group) and/or when group 2 is also homophilic (which reduces the payoff to heterophilic
interactions).
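The two-group example can be checked numerically. The sketch below assumes the payoff expression reads [a(N_1 − 1)p_1² + bN_2(1 − p_1)(1 − p_2)]/(N − 1); the function names and the numbers in the usage note are hypothetical choices for illustration.

```python
def group1_payoff(p1, p2, n1, n2, a, b):
    """Payoff to a group-1 individual when everyone in group k uses preference p_k."""
    n = n1 + n2
    same = a * (n1 - 1) * p1 ** 2              # partners sharing phenotype 1
    different = b * n2 * (1 - p1) * (1 - p2)   # partners with phenotype 2
    return (same + different) / (n - 1)


def homophily_pays(p2, n1, n2, a, b):
    """True if perfect homophily (p1 = 1) beats perfect heterophily (p1 = 0)."""
    return a > n2 * (1 - p2) * b / (n1 - 1)
```

With N_1 = 30, N_2 = 10, a = 0.5, b = 1 and a fully heterophilic group 2 (p_2 = 0), `homophily_pays` returns True: the threshold is 10/29, roughly 0.34, which is below a = 0.5, so perfect homophily beats perfect heterophily for the majority group even though a < b. Swapping the group sizes raises the threshold to 30/9, roughly 3.3, and the advantage disappears.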

Using coalescent theory 30, we derive a closed-form solution for the limiting distribution of the preferences
that survive in the population at equilibrium 31,32, and we replicate all predictions with computational
simulations. Figure 1 shows three examples of such equilibria.

When the payoff to homophilic interactions is high, those with a preference for homophily (high p) are more
common than those with a preference for heterophily. Compared to a population with preferences distributed
uniformly at random, the theory establishes a specific critical point $p_c$ above which individuals are favoured
and below which they are disfavoured by natural selection. When the benefit to interacting with similar and
dissimilar others is more in balance, a bimodal distribution of preferences emerges with most individuals
strongly preferring either homophilic or heterophilic interactions. Finally, when the payoff to specialisation is
high, those with a preference for heterophily (low p) emerge and are favoured when they have a preference below a
critical point $p_c$.

Figure 1 | Equilibrium distribution of homophilic preferences, p, and the networks that result. Solid blue lines
in (a), (b), and (c) represent the theoretical distributions. Arrows show the theoretical critical point $p_c$
where the distribution crosses the uniform distribution (i.e. p becomes more common or less common than expected
due to chance). (d) and (e) display networks of homophilic and heterophilic structures of social relationships
that emerge from computer simulations of (a) and (c), respectively. Parameters: N = 30, $\beta$ = 0.005, M = 3,
u = 0.04, v = 0.06, (a, d) a = 1, b = 0.1, (b) a = 5/9, b = 1, (c, e) a = 0.1, b = 1. Results are averaged over
T = $10^9$ time steps.



While these results may seem unsurprising, Figure 2 shows that the theory generates an elegant critical
threshold a > Kb that determines whether the average individual in the population will evolve to become
homophilic. The slope of this threshold is:

$$K = \frac{\nu(\mu + \nu + 2)(M - 1)}{\nu(\mu + \nu + 2) + (\mu + 2\nu + 3)M},$$

where M is the number of possible phenotypes, and $\mu$ and $\nu$ are mutation rates that are rescaled by the
population size N, $\mu = uN$ and $\nu = vN$.
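The threshold slope is easy to evaluate for the parameter sets quoted in the figure captions. The sketch below assumes the formula reads K = ν(μ + ν + 2)(M − 1) / [ν(μ + ν + 2) + (μ + 2ν + 3)M] with μ = uN and ν = vN; the function name `critical_slope` is hypothetical.

```python
def critical_slope(N, M, u, v):
    """Slope K of the threshold a > K * b above which homophily is favoured.

    N: population size; M: number of phenotypes;
    u: preference (strategy) mutation probability;
    v: phenotypic mutation probability.
    """
    mu = u * N   # rescaled preference mutation rate
    nu = v * N   # rescaled phenotypic mutation rate
    shared = nu * (mu + nu + 2)
    return shared * (M - 1) / (shared + (mu + 2 * nu + 3) * M)
```

With the Figure 2(a) parameters (N = 50, M = 4, u = v = 0.06) this evaluates to K = 1, and reducing M to 2 as in panel (b) gives K = 0.5. The Figure 1 parameters (N = 30, M = 3, u = 0.04, v = 0.06) give K of about 0.556, that is 5/9, which matches the value of a used in Figure 1(b) with b = 1.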

An important implication of this condition is that homophily can evolve even when the benefit b to heterophily
(specialisation) exceeds the benefit a to homophily (synergy). This happens when K < 1. For example, Figure 2
shows that decreasing the number of phenotypes in the population (M), the phenotypic mutation rate ($\nu$), or
the preference mutation rate ($\mu$), all decrease K and in turn increase the range of values under which
homophily can emerge. Figure 3 shows that the average individual in the population becomes less homophilic as we
increase each of these three parameters. And, in Figure 4, we show the full set of conditions under which
homophily evolves for a given population size. We emphasise that homophily evolves even if $a \ll b$ as long as
the (phenotypic) mutation rates are sufficiently low.

We studied the behaviour of the critical point $p_c$ (see SI) and found a surprisingly strong tendency toward
homophily when mutation rates are low. In fact, when we let both $\nu$ and $\mu$ approach 0, the model shows that
natural selection favours all individuals with $p > 1/\sqrt{3}$. In other words, all homophilic individuals and
even those individuals that weakly prefer heterophily tend to do better than those that strongly prefer
heterophily. And this result is independent of the payoff to heterophilic interactions, b, and phenotypic
diversity, M. The only requirement is that the payoff to homophilic interactions, a, be positive.

Given constraints on mobility, it may be unrealistic to assume that individuals are equally likely to meet with
all other members of the population 7,33. In many species (including humans) interactions tend to be more likely
between individuals of the same type because they are drawn to and/or drawn from similar environments. We
therefore extended the basic model by introducing an additional parameter to allow for such assortativity. With
probability $0 < \phi < 1$, individuals interact with others of the same phenotype, and with probability
$1 - \phi$ they interact as before (see SI).

As it turns out, this extension generates an almost identical critical threshold, with homophily emerging when
$a > (1 - \phi)Kb$. Since $0 < \phi < 1$, this result shows that any degree of assortative matching that brings
similar types into contact with one another more often makes it even more likely that homophily will evolve.
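The effect of assortative matching on the threshold can be illustrated with a short helper. This is a sketch that assumes the condition reads a > (1 − φ)Kb and takes the slope K as a given input; the function name `homophily_threshold` is hypothetical.

```python
def homophily_threshold(b, K, phi=0.0):
    """Value that a must exceed for homophily to be favoured: (1 - phi) * K * b.

    b: payoff to heterophilic interaction.
    K: critical slope of the basic model (supplied by the caller).
    phi: probability of being matched with a same-phenotype partner;
         phi = 0 recovers the basic model.
    """
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must lie in [0, 1)")
    return (1.0 - phi) * K * b
```

For example, with K = 1 and b = 1 the basic model requires a > 1, whereas φ = 0.5 halves the requirement to a > 0.5.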

Another strong assumption of the basic model is that increasing a preference to interact with individuals of the
same phenotype (p) yields an equal and opposite decrease in the preference to interact with individuals of
different phenotypes (1 − p). Yet some individuals might want to interact with both similar and different
phenotypes. We relaxed this assumption by allowing a separate preference q (independent of the preference p) to
evolve that indicates the probability that an individual interacts with individuals of different phenotypes.
Thus, each individual was characterised by a triplet $\{G_i, p_i, q_i\}$ denoting phenotype, homophilic
preference, and heterophilic preference, respectively.

In this extended version of the model (see SI), we found that the average homophilic preference
$\langle p \rangle$ was greater than the average heterophilic preference $\langle q \rangle$ in equilibrium if
and only if a > Kb. Note that this is exactly the same condition that results from the basic model. Moreover,
under the same condition (a > Kb), the population consists of more homophilic individuals (p > q) than
heterophilic ones (p < q). In other words, the population tends to show more homophily than heterophily.

Figure 2 | Evolutionary determinants of homophily. Shown is the population average $\langle p \rangle$ as a
function of the payoff values a and b; the horizontal axis of each panel is the payoff to heterophilic
interaction, b, and the regions are labelled homophily and heterophily. Compared with the base case in (a),
decreases in (b) the number of phenotypes, M, (c) the strategy mutation rate, u, and (d) the phenotypic mutation
rate, v, always make it easier for homophily to evolve. For each b value, squares denote the critical a values
determined by simulations. The straight lines are the theoretical predictions. We find good agreement between
simulation results and our analytical theory. Parameters: N = 50, $\beta$ = 0.005, (a) M = 4, u = 0.06, v = 0.06,
(b) M = 2, u = 0.06, v = 0.06, (c) M = 4, u = 0.01, v = 0.06, (d) M = 4, u = 0.06, v = 0.02. Results are averaged
over T = $10^8$ time steps.

Finally, in all models, we assume natural selection is weak, which means that fitness differences are very small.
Increasing the strength of natural selection relaxes this assumption and magnifies the fitness difference between
traits. As a result, the evolutionary dynamics become increasingly deterministic, and the critical point $p_c$
and the distribution of preferences in equilibrium become more skewed towards the extremes (p = 0 and p = 1).

Discussion 

Our model differs from prior studies that have taken into account assortativity in mating choice or ecological
competition interactions. Among them, one study 34 concerns whether a particular assortative mating choice of
females is favoured for a fixed composition of phenotypes in the population. In that model, the mating choice is
exclusively unilateral and up to females, and the derived condition only gives the direction of evolution at a
given population composition. In contrast, interactions in our model are based on bilateral agreement,
individuals' preferences and phenotypes are allowed to co-evolve, and most importantly, we analytically derive
the mutation-selection equilibrium, which explicitly accounts for the abundance of any homophilic trait in
long-run evolution. Some other previous studies consider the role of pre-existing assortative competition, rather
than the emergence of assortativity per se, in asexual 35 or sexual 36 selection contexts. Complementing and
strengthening these prior works 34-36, our model explicitly addresses how homophily evolves in the first place.

Figure 3 | Population average of homophilic preferences, $\langle p \rangle$. Shown is the population average
$\langle p \rangle$ as a function of (a) the number of phenotypes, M, (b) the strategy mutation rate, u, and
(c) the phenotypic mutation rate, v. The filled circles are simulation results, which agree well with theoretical
predictions (solid lines). As M, u, and v increase, the population becomes less homophilic. Parameters: N = 50,
$\beta$ = 0.002, a = 3/4, b = 1, (a) u = 0.04, v = 0.02, (b) M = 5, v = 0.02, (c) M = 5, u = 0.04. Results are
averaged over T = $5 \times 10^8$ time steps.

Figure 4 | The full set of conditions under which homophily evolves. The coloured 3D regions in (a), (c) show the
combinations of parameters ($\mu$, $\nu$, M) that allow natural selection to favour homophily for given payoff
values: (a) a = 3/4, b = 1 and (c) a = 1/10, b = 1. Corresponding to (a) and (c) respectively, the shaded areas
in (b), (d) denote the set of parameters ($\mu$, $\nu$) favouring homophily with the number of phenotypes M = 2.
The boundaries of these shaded areas shrink to the dashed red lines as M increases to $\infty$. Homophily evolves
even if $a \ll b$, as long as the mutation rates are low.

Furthermore, our evolutionary model shows how homophily can emerge under a wide variety of conditions,
particularly when mutation rates are low. It is not surprising that the payoff to homophilic interactions must be
high relative to the payoff to heterophilic interactions in order for homophily to evolve. However, the analysis
also suggests that this relationship is only relative, since the threshold for the ratio between these benefits
can be less than one. This means that synergy may have a powerful effect on evolution, even when there are
substantial benefits to specialisation, helping to explain the ubiquity of homophily in nature. In higher-order
species, the emergence of language or other forms of communication, or of certain cognitive capacities, might
serve such a function, and may help to promote a general tendency to seek out similar individuals with whom to
cooperate or interact.

The model also shows that even small advantages to synergy can significantly reduce phenotypic diversity.
Heterophilic populations maintain diversity by privileging rare phenotypes, generally causing their distribution
to become uniform. Homophilic populations, on the other hand, privilege common phenotypes, helping to drive
alternative phenotypes to extinction. Even if all phenotypes are themselves fitness neutral (as they are in our
model), advantages to synergy will tend to yield populations dominated by a single phenotype, and in the long run
the population will tend to oscillate from one dominant phenotype to another, with rapid phase transitions in
between.

Our results may also shed light on the observation that evolution in humans is accelerating 37. The human
capacity to collaborate not only with kin but also with unrelated members of our species may have dramatically
increased the potential gains from synergy, and this shift in payoffs would not only favour interactions with
similar partners, but would also affect the overall desire to search out such partners. A wide variety of studies
suggest that humans particularly seek out similar others 6,38, even when there are no obvious benefits to these
interactions 39 (on the contrary, there may be more benefits to specialisation). Hence, it is possible that we
evolved a strong predilection for homophily once we started to interact frequently with unrelated individuals.
Such an effect would especially accelerate the evolution of phenotypes that are intrinsically synergistic, such
as those related to communication or other collaborative activities.

Here, we focus on only one phenotypic dimension, but it is possible to extend our model to multiple dimensions of
phenotypes (see SI). Payoffs to synergy and specialisation may vary for different sets of phenotypes, as each set
serves a different function in social interactions. Therefore, multi-layer networks superimposed on various types
of social interactions could result, and each layer could play a different role. It is possible that in one layer
'similarity attracts', leading to stable, long-term relationships 6, while in another layer 'opposite attracts',
which often results in short-lived task-related ties 8. In this way, natural selection could lead to the
delicate, if very lopsided, balance between homophily and heterophily that we observe today in the real world.

Methods 

In the Supplementary Information, we elaborate on the model presented above and
analyse it, characterising equilibria under mutation-selection, explaining selection
criteria, deriving a formula for continuous preferences, and showing long-form
equations for the triplet correlations at neutrality that allow us to use coalescent
theory to derive conditions for the evolution of homophily. We then describe how we
derived the critical slopes and critical preferences shown in the main text, and
characterise equilibria for the whole parameter space. Finally, we turn our attention to
several extensions of the model, analytically characterising models with biased
matching, local mutation, strong selection, full strategy space, and multiple sets of
phenotypes.

We also describe the application of the model to derive estimates of homophily
from empirical data, as shown in Table 1.